1 Distribution of Durations and Intervals

Figure 1: The distribution of element durations and inter-element intervals from the whale vocal sequences included in this analysis. The times are z-scored within each study to enable direct comparison.
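The within-study z-scoring mentioned in the caption can be sketched as follows. This is a minimal illustration only: the function name `zscore_within_study`, the `(study, time)` record layout, and the use of the sample standard deviation are assumptions, not details taken from the analysis itself.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_study(records):
    """Z-score times separately within each study.

    `records` is a list of (study, time) pairs (a hypothetical layout).
    Returns a list of (study, z) pairs in the same order.
    """
    by_study = defaultdict(list)
    for study, time in records:
        by_study[study].append(time)
    # Assumption: sample standard deviation (n - 1 denominator).
    stats = {s: (mean(v), stdev(v)) for s, v in by_study.items()}
    return [(s, (t - stats[s][0]) / stats[s][1]) for s, t in records]


records = [("A", 1.0), ("A", 2.0), ("A", 3.0),
           ("B", 10.0), ("B", 20.0), ("B", 30.0)]
scaled = zscore_within_study(records)
```

In this toy example the two studies differ tenfold in raw scale but map onto the same standardized values, which is what makes the pooled distribution comparable across studies.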

2 Effects in Whale Data

Table 1: The effect of sequence length on element/interval duration for each whale species, computed from the base model that excludes position. 2.5% and 97.5% denote the lower and upper bounds of the 95% confidence intervals.

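| Group | Species | Effect | 2.5% | 97.5% |
|---|---|---|---|---|
| Mysticete | Blue Whale | -0.255 | -0.331 | -0.178 |
| Mysticete | Bowhead Whale | -0.184 | -0.318 | -0.051 |
| Mysticete | Humpback Whale | -0.678 | -0.692 | -0.665 |
| Mysticete | Minke Whale | -0.278 | -0.294 | -0.261 |
| Mysticete | Right Whale | 0.309 | 0.278 | 0.34 |
| Mysticete | Sei Whale | -0.213 | -0.363 | -0.063 |
| Odontocete | Bottlenose Dolphin | -0.242 | -0.347 | -0.138 |
| Odontocete | Commerson's Dolphin | 0.221 | 0.087 | 0.356 |
| Odontocete | Heaviside's Dolphin | -0.119 | -0.32 | 0.083 |
| Odontocete | Hector's Dolphin | -0.008 | -0.274 | 0.258 |
| Odontocete | Killer Whale | 0.121 | 0.003 | 0.239 |
| Odontocete | Narrow-Ridged Finless Porpoise | -0.304 | -0.338 | -0.27 |
| Odontocete | Peale's Dolphin | -0.333 | -0.489 | -0.177 |
| Odontocete | Risso's Dolphin | -0.42 | -0.448 | -0.392 |
| Odontocete | Sperm Whale | -0.234 | -0.241 | -0.226 |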

Table 2: The effect of sequence length and element/interval position on element/interval duration for each whale species, computed from the expanded model that includes both length and position. 2.5% and 97.5% denote the lower and upper bounds of the 95% confidence intervals.

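| Group | Species | Length effect | Length 2.5% | Length 97.5% | Position effect | Position 2.5% | Position 97.5% |
|---|---|---|---|---|---|---|---|
| Mysticete | Blue Whale | -0.255 | -0.331 | -0.178 | -0.064 | -0.087 | -0.041 |
| Mysticete | Bowhead Whale | -0.179 | -0.313 | -0.046 | -0.751 | -0.789 | -0.713 |
| Mysticete | Humpback Whale | -0.678 | -0.691 | -0.665 | -0.193 | -0.201 | -0.186 |
| Mysticete | Minke Whale | -0.278 | -0.294 | -0.261 | -0.017 | -0.023 | -0.011 |
| Mysticete | Right Whale | 0.309 | 0.278 | 0.34 | 0.107 | 0.096 | 0.119 |
| Mysticete | Sei Whale | -0.213 | -0.364 | -0.062 | -0.105 | -0.139 | -0.071 |
| Odontocete | Bottlenose Dolphin | -0.242 | -0.346 | -0.138 | -0.084 | -0.111 | -0.056 |
| Odontocete | Commerson's Dolphin | 0.221 | 0.087 | 0.356 | -0.106 | -0.118 | -0.095 |
| Odontocete | Heaviside's Dolphin | -0.119 | -0.32 | 0.083 | 0.019 | 0.01 | 0.027 |
| Odontocete | Hector's Dolphin | -0.008 | -0.274 | 0.258 | -0.001 | -0.01 | 0.008 |
| Odontocete | Killer Whale | 0.121 | 0.021 | 0.221 | 0.528 | 0.428 | 0.628 |
| Odontocete | Narrow-Ridged Finless Porpoise | -0.305 | -0.339 | -0.271 | 0.168 | 0.151 | 0.185 |
| Odontocete | Peale's Dolphin | -0.333 | -0.489 | -0.177 | -0.013 | -0.017 | -0.009 |
| Odontocete | Risso's Dolphin | -0.42 | -0.448 | -0.392 | -0.196 | -0.2 | -0.192 |
| Odontocete | Sperm Whale | -0.234 | -0.241 | -0.226 | 0.028 | 0.026 | 0.031 |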

3 Effects in Human Data

Table 3: The effect of sequence length on element/interval duration for each human language, computed from the base model that excludes position. 2.5% and 97.5% denote the lower and upper bounds of the 95% confidence intervals.

| Language | Effect | 2.5% | 97.5% |
|---|---|---|---|
| Anal | -0.104 | -0.113 | -0.095 |
| Arapaho | 0.03 | 0.02 | 0.04 |
| Asimjeeg Datooga | -0.063 | -0.073 | -0.053 |
| Baïnounk Gubëeher | -0.102 | -0.11 | -0.093 |
| Beja | -0.066 | -0.073 | -0.059 |
| Bora | -0.127 | -0.138 | -0.116 |
| Cabécar | -0.11 | -0.12 | -0.1 |
| Cashinahua | -0.1 | -0.108 | -0.091 |
| Daakie | -0.131 | -0.141 | -0.121 |
| Dalabon | -0.079 | -0.091 | -0.066 |
| Dolgan | -0.13 | -0.139 | -0.121 |
| English (Southern England) | -0.053 | -0.066 | -0.04 |
| Evenki | -0.101 | -0.109 | -0.092 |
| Fanbyak | -0.091 | -0.101 | -0.08 |
| French (Swiss) | -0.05 | -0.063 | -0.037 |
| Goemai | -0.124 | -0.138 | -0.11 |
| Gorwaa | -0.125 | -0.136 | -0.114 |
| Hoocąk | -0.099 | -0.109 | -0.09 |
| Jahai | -0.062 | -0.073 | -0.05 |
| Jejuan | -0.093 | -0.104 | -0.083 |
| Kakabe | -0.123 | -0.135 | -0.111 |
| Kamas | -0.096 | -0.111 | -0.082 |
| Komnzo | -0.081 | -0.091 | -0.071 |
| Light Warlpiri | -0.144 | -0.154 | -0.133 |
| Lower Sorbian | -0.078 | -0.089 | -0.067 |
| Mojeño Trinitario | -0.147 | -0.156 | -0.137 |
| Movima | -0.014 | -0.023 | -0.006 |
| Nafsan (South Efate) | -0.072 | -0.082 | -0.062 |
| Nisvai | -0.064 | -0.074 | -0.053 |
| Nllng | -0.098 | -0.111 | -0.086 |
| Northern Alta | -0.069 | -0.079 | -0.059 |
| Northern Kurdish (Kurmanji) | -0.022 | -0.033 | -0.011 |
| Pnar | -0.021 | -0.035 | -0.007 |
| Resígaro | -0.079 | -0.089 | -0.069 |
| Ruuli | -0.039 | -0.048 | -0.03 |
| Sadu | -0.124 | -0.136 | -0.112 |
| Sanzhi Dargwa | -0.135 | -0.148 | -0.122 |
| Savosavo | -0.052 | -0.062 | -0.042 |
| Sümi | -0.09 | -0.101 | -0.079 |
| Svan | -0.066 | -0.074 | -0.057 |
| Tabaq (Karko) | -0.213 | -0.222 | -0.203 |
| Tabasaran | -0.138 | -0.151 | -0.125 |
| Teop | -0.17 | -0.181 | -0.159 |
| Texistepec Popoluca | -0.07 | -0.081 | -0.059 |
| Urum | -0.126 | -0.134 | -0.118 |
| Vera'a | -0.152 | -0.163 | -0.141 |
| Warlpiri | -0.147 | -0.157 | -0.137 |
| Yali (Apahapsili) | -0.191 | -0.202 | -0.179 |
| Yongning Na | -0.128 | -0.142 | -0.114 |
| Yucatec Maya | -0.067 | -0.079 | -0.056 |
| Yurakaré | -0.198 | -0.204 | -0.191 |

Table 4: The effect of sequence length and element/interval position on element/interval duration for each human language, computed from the expanded model that includes both length and position. 2.5% and 97.5% denote the lower and upper bounds of the 95% confidence intervals.

| Language | Length effect | Length 2.5% | Length 97.5% | Position effect | Position 2.5% | Position 97.5% |
|---|---|---|---|---|---|---|
| Anal | -0.105 | -0.114 | -0.096 | 0.1 | 0.091 | 0.109 |
| Arapaho | 0.03 | 0.02 | 0.039 | 0.099 | 0.09 | 0.109 |
| Asimjeeg Datooga | -0.063 | -0.074 | -0.053 | 0.186 | 0.177 | 0.196 |
| Baïnounk Gubëeher | -0.102 | -0.11 | -0.093 | 0.096 | 0.088 | 0.104 |
| Beja | -0.066 | -0.073 | -0.059 | 0.065 | 0.058 | 0.072 |
| Bora | -0.127 | -0.138 | -0.116 | -0.131 | -0.14 | -0.123 |
| Cabécar | -0.11 | -0.12 | -0.1 | 0.021 | 0.012 | 0.031 |
| Cashinahua | -0.1 | -0.108 | -0.091 | -0.019 | -0.028 | -0.01 |
| Daakie | -0.131 | -0.141 | -0.122 | 0.164 | 0.155 | 0.173 |
| Dalabon | -0.079 | -0.091 | -0.066 | 0.165 | 0.153 | 0.177 |
| Dolgan | -0.13 | -0.139 | -0.121 | 0.043 | 0.034 | 0.052 |
| English (Southern England) | -0.053 | -0.066 | -0.04 | 0.05 | 0.038 | 0.062 |
| Evenki | -0.101 | -0.109 | -0.092 | 0.042 | 0.033 | 0.05 |
| Fanbyak | -0.091 | -0.101 | -0.081 | 0.161 | 0.151 | 0.171 |
| French (Swiss) | -0.05 | -0.063 | -0.037 | 0.16 | 0.151 | 0.169 |
| Goemai | -0.124 | -0.138 | -0.11 | 0.063 | 0.052 | 0.074 |
| Gorwaa | -0.125 | -0.136 | -0.114 | 0.009 | 0 | 0.018 |
| Hoocąk | -0.099 | -0.109 | -0.09 | 0.141 | 0.132 | 0.151 |
| Jahai | -0.062 | -0.073 | -0.05 | 0.142 | 0.131 | 0.153 |
| Jejuan | -0.093 | -0.104 | -0.083 | 0.038 | 0.028 | 0.049 |
| Kakabe | -0.123 | -0.135 | -0.111 | 0.103 | 0.091 | 0.115 |
| Kamas | -0.096 | -0.111 | -0.082 | 0.003 | -0.012 | 0.017 |
| Komnzo | -0.081 | -0.091 | -0.071 | 0.026 | 0.018 | 0.035 |
| Light Warlpiri | -0.144 | -0.154 | -0.133 | 0.078 | 0.067 | 0.088 |
| Lower Sorbian | -0.078 | -0.089 | -0.067 | 0.046 | 0.037 | 0.056 |
| Mojeño Trinitario | -0.147 | -0.156 | -0.137 | -0.074 | -0.083 | -0.065 |
| Movima | -0.014 | -0.023 | -0.006 | -0.053 | -0.062 | -0.045 |
| Nafsan (South Efate) | -0.072 | -0.082 | -0.062 | 0.094 | 0.085 | 0.103 |
| Nisvai | -0.064 | -0.074 | -0.054 | 0.151 | 0.144 | 0.159 |
| Nllng | -0.098 | -0.111 | -0.086 | 0.155 | 0.142 | 0.167 |
| Northern Alta | -0.069 | -0.079 | -0.059 | 0.091 | 0.081 | 0.101 |
| Northern Kurdish (Kurmanji) | -0.022 | -0.033 | -0.011 | 0.033 | 0.023 | 0.042 |
| Pnar | -0.021 | -0.035 | -0.007 | 0.087 | 0.076 | 0.099 |
| Resígaro | -0.079 | -0.089 | -0.069 | 0.012 | 0.003 | 0.022 |
| Ruuli | -0.039 | -0.048 | -0.03 | 0.069 | 0.061 | 0.078 |
| Sadu | -0.124 | -0.136 | -0.112 | 0.2 | 0.189 | 0.212 |
| Sanzhi Dargwa | -0.135 | -0.148 | -0.122 | 0.012 | 0 | 0.023 |
| Savosavo | -0.052 | -0.062 | -0.042 | -0.125 | -0.134 | -0.117 |
| Sümi | -0.09 | -0.101 | -0.08 | 0.109 | 0.098 | 0.12 |
| Svan | -0.066 | -0.074 | -0.057 | -0.026 | -0.035 | -0.017 |
| Tabaq (Karko) | -0.213 | -0.223 | -0.203 | -0.071 | -0.08 | -0.061 |
| Tabasaran | -0.138 | -0.151 | -0.125 | -0.025 | -0.037 | -0.012 |
| Teop | -0.17 | -0.181 | -0.159 | 0.072 | 0.063 | 0.081 |
| Texistepec Popoluca | -0.07 | -0.081 | -0.059 | 0.02 | 0.011 | 0.03 |
| Urum | -0.126 | -0.134 | -0.118 | -0.007 | -0.015 | 0.001 |
| Vera'a | -0.152 | -0.163 | -0.141 | 0.169 | 0.16 | 0.177 |
| Warlpiri | -0.147 | -0.157 | -0.137 | -0.026 | -0.035 | -0.017 |
| Yali (Apahapsili) | -0.192 | -0.203 | -0.18 | 0.14 | 0.13 | 0.15 |
| Yongning Na | -0.128 | -0.142 | -0.114 | 0.098 | 0.085 | 0.112 |
| Yucatec Maya | -0.067 | -0.079 | -0.056 | -0.075 | -0.084 | -0.066 |
| Yurakaré | -0.198 | -0.204 | -0.191 | -0.035 | -0.04 | -0.029 |

4 Words in Sentences

Figure 2: The 95% confidence intervals for the effect of sequence length (top) and position (bottom) on element/interval duration for the 16 whale species and 51 human languages. The human language data consist of words within sentences.

5 Production Constraint Model

James et al. (1) recently found that Menzerath’s law can be detected in pseudorandom sequences of birdsong syllables that are forced to match the durations of real songs. They interpret their model as approximating simple motor constraints, so that stronger effects in the real data would indicate additional mechanisms (e.g., communicative efficiency through behavioral plasticity). I originally planned to compare the strength of Menzerath’s law in the real data with simulated data from the model of James et al. (1), as I recently did for house finch song (2), but analyses of language data suggest that it is far too conservative a null model: none of the 51 languages in the DoReCo dataset exhibit Menzerath’s law to a greater extent than the simulated data. Even though most whale species exhibit Menzerath’s law to a greater extent than simulated data from the null model of James et al. (1) (75%; 12 out of 16 species), I do not want to over-interpret this result given the pattern in the human data. On further reflection, I think that the fundamental assumption of James et al. (1), that sequence durations are governed by motor constraints alone, is unlikely to apply to many species with more complex communication systems. In humpback whales and sperm whales, for example, there appears to be significant inter-individual variation in song and coda length depending on social context (3,4). More details about this analysis are below.

The production constraint model of James et al. (1) works as follows. In each iteration of the model, a pseudorandom sequence is produced for each real song in the dataset. Syllables are randomly sampled (with replacement) from the population until the duration of the random sequence exceeds the duration of the real song. If the difference between the duration of the random sequence and the real song is <50% of the duration of the final syllable, then the final syllable is kept in the sequence. Otherwise, it is removed. Each iteration of the model thus produces a set of random sequences with approximately the same distribution of durations as the real data.
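The sampling procedure described above can be sketched in a few lines of Python. This is an illustrative reading of the verbal description, not the code of James et al. (1): the function names are hypothetical, sequence duration is assumed to be the sum of syllable durations (gaps are ignored), and all durations are assumed to be positive.

```python
import random


def simulate_sequence(target_duration, syllable_pool, rng):
    """Build one pseudorandom sequence for a real song of `target_duration`."""
    sequence = []
    total = 0.0
    # Sample syllables with replacement until the sequence exceeds the song.
    while total <= target_duration:
        syllable = rng.choice(syllable_pool)
        sequence.append(syllable)
        total += syllable
    # Keep the final syllable only if the overshoot is <50% of its duration.
    overshoot = total - target_duration
    if overshoot >= 0.5 * sequence[-1]:
        sequence.pop()
    return sequence


def simulate_dataset(song_durations, syllable_pool, seed=None):
    """One model iteration: one pseudorandom sequence per real song."""
    rng = random.Random(seed)
    return [simulate_sequence(d, syllable_pool, rng) for d in song_durations]


# Toy values, not real data.
pool = [0.1, 0.2, 0.4]
songs = [1.0, 2.5, 0.7]
simulated = simulate_dataset(songs, pool, seed=1)
```

Under this rule each simulated sequence ends within half of its final (kept or dropped) syllable's duration of the real song, which is why the simulated durations track the real ones so closely. The sketch does not handle the edge case where a single sampled syllable greatly overshoots a very short song and the sequence is left empty; the description above does not say how that case is treated.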

For each species, I generated 100 simulated datasets from (i) the random sequence model and (ii) the production constraint model. I then fit Menzerath’s law separately to each of the 100 simulated datasets and pooled the model estimates for \(a\) and \(b\) using Rubin’s rules, as implemented in the mice package in R (5). The results are shown in Figure 3.
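For readers unfamiliar with this pooling step, the core calculation can be sketched as follows. The pooled estimate is the mean of the per-dataset estimates, and the total variance is the mean within-dataset variance plus the between-dataset variance inflated by a factor of (1 + 1/m), where m is the number of datasets. The function name `pool_rubin` and the toy numbers are hypothetical, and the sketch omits the degrees-of-freedom adjustment that the mice package also applies.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool m estimates and their standard errors.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    q_bar = mean(estimates)                      # pooled point estimate
    u_bar = mean([se ** 2 for se in std_errors])  # within-dataset variance
    b = variance(estimates)                      # between-dataset variance
    total_var = u_bar + (1 + 1 / m) * b          # total variance
    return q_bar, sqrt(total_var)


# Toy values, not results from the analysis.
pooled_est, pooled_se = pool_rubin([-0.2, -0.3, -0.4], [0.1, 0.1, 0.1])
```

The pooled standard error is larger than any single dataset's standard error whenever the estimates vary across simulated datasets, so the resulting confidence intervals reflect simulation-to-simulation variability as well as sampling error.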

Most importantly, the estimated effects from the production constraint model tend to be more negative than those from the real human language data, suggesting that this null model is far too conservative to be informative about “language-like” efficiency.

Figure 3: The point estimates from the real data alongside 95% confidence intervals from 10 simulated datasets from the production constraint model, for the effect of sequence length on element/interval duration for the 16 whale species and 51 human languages. The human language data consist of phonemes within words.

6 Median Interpolation

Figure 4: The point estimates from the original datasets (red) compared to median-interpolated datasets (blue). Interpolating sequences with the median duration of each element category appears to systematically shift model estimates towards zero (in over 90% of cases).
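One reading of median interpolation is that every element's measured duration is replaced by the median duration of its element category. The sketch below illustrates that reading only; the function name `median_interpolate` and the `(category, duration)` layout are assumptions rather than details of the actual pipeline.

```python
from collections import defaultdict
from statistics import median


def median_interpolate(elements):
    """Replace each duration with the median duration of its category.

    `elements` is a list of (category, duration) pairs (hypothetical layout).
    """
    by_category = defaultdict(list)
    for category, duration in elements:
        by_category[category].append(duration)
    medians = {c: median(v) for c, v in by_category.items()}
    return [(category, medians[category]) for category, _ in elements]


# Toy values, not real data.
original = [("A", 1.0), ("A", 3.0), ("A", 2.0), ("B", 5.0)]
interpolated = median_interpolate(original)
```

Because this removes all within-category variation in duration, any shortening of elements in longer sequences that occurs within a category is lost, which is consistent with estimates moving towards zero.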

7 Patterned Burst Pulses

Martin et al. (6) noticed that Heaviside’s dolphins produce temporally patterned burst pulses with much more rhythmic variation in some social situations. I analyzed the 27 patterned burst pulses recorded by Martin et al. (6) and found that they adhere to Menzerath’s law: there is a negative relationship between sequence length and element duration (estimate = -0.186, 95% CI: [-0.308, -0.063]).

References

1. James LS, Mori C, Wada K, Sakata JT. Phylogeny and mechanisms of shared hierarchical patterns in birdsong. Current Biology [Internet]. 2021 Jul [cited 2023 Apr 12];31(13):2796–2808.e9. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0960982221005285
2. Youngblood M. Language-like efficiency and structure in house finch song. Proceedings of the Royal Society B. 2024;291(2020):20240250.
3. Mercado E. Intra-individual variation in the songs of humpback whales suggests they are sonically searching for conspecifics. Learn Behav [Internet]. 2022 Dec [cited 2024 Apr 18];50(4):456–81. Available from: https://link.springer.com/10.3758/s13420-021-00495-0
4. Hersh TA, Gero S, Rendell L, Cantor M, Weilgart L, Amano M, et al. Evidence from sperm whale clans of symbolic marking in non-human cultures. Proc Natl Acad Sci USA [Internet]. 2022 Sep 13 [cited 2024 Apr 15];119(37):e2201692119. Available from: https://pnas.org/doi/full/10.1073/pnas.2201692119
5. van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Soft [Internet]. 2011 [cited 2024 Mar 8];45(3). Available from: http://www.jstatsoft.org/v45/i03/
6. Martin MJ, Elwen SH, Kassanjee R, Gridley T. To buzz or burst-pulse? The functional role of Heaviside’s dolphin, Cephalorhynchus heavisidii, rapidly pulsed signals. Animal Behaviour [Internet]. 2019 Apr [cited 2024 Mar 8];150:273–84. Available from: https://linkinghub.elsevier.com/retrieve/pii/S0003347219300089